The current dataset includes 454 assemblies for the genus
Pectobacterium downloaded from the NCBI Assembly database and internal
assemblies provided by collaborators. Metadata for all NCBI assemblies
was downloaded as XML from the Assembly database using the E-utilities
tools provided by NCBI. The linked BioSample metadata was fetched from
the BioSample database in the same way, and the combined data is
summarized in the figures below.
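
The metadata download described above can be sketched with the public E-utilities `esearch` endpoint. This is a minimal illustration, not the pipeline actually used for this report: the helper names `esearch_url` and `parse_esearch_ids`, the search term `Pectobacterium[Organism]`, and the `retmax` value are all assumptions made for the example.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Public base URL of the NCBI E-utilities service.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(db, term, retmax=500):
    """Build an esearch URL (hypothetical helper) for a database and query."""
    params = {"db": db, "term": term, "retmax": retmax}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


def parse_esearch_ids(xml_text):
    """Extract record UIDs from an esearch XML response."""
    root = ET.fromstring(xml_text)
    return [el.text for el in root.findall("./IdList/Id")]


# Assumed query: all Assembly records for the genus.
url = esearch_url("assembly", "Pectobacterium[Organism]")

# To run the search, fetch `url` (for example with urllib.request.urlopen)
# and pass the decoded response body to parse_esearch_ids().
```

The returned UIDs can then be passed to `esummary` or `efetch` to retrieve the XML metadata for each record.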
NCBI performs internal QC on the genome assemblies submitted by users. A genome assembly can be excluded from NCBI for several possible reasons (see here for details). In the current dataset, 0 genome assemblies have known issues. The sunburst chart in the second column of the current row shows the number of assemblies that were flagged by NCBI for one of the QC metrics listed below (innermost to outermost order):
NCBI also performs taxonomy validation for prokaryote genomes. It uses Average Nucleotide Identity (ANI) scores to verify the declared species of each submitted genome. The method is described in Ciufo et al. 2018. The bar chart in the third column of the current row shows the number of assemblies in each taxonomy check status.
A species can have multiple type strain genomes available in NCBI. The following table summarizes all the type strains of the genus Pectobacterium available in the NCBI Assembly database.
The following plots show quantitative metrics such as N50, L50, contig count, and BUSCO score for the genomes.
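
For readers unfamiliar with the contiguity metrics, the standard definitions of N50 and L50 can be illustrated with a short sketch. The function name `n50_l50` and the toy contig lengths are assumptions for the example; the values in the plots come from the assembly metadata, not from this code.

```python
def n50_l50(contig_lengths):
    """Return (N50, L50) for a collection of contig lengths.

    N50 is the length of the shortest contig in the smallest set of
    longest contigs that together cover at least half of the assembly.
    L50 is the number of contigs in that set.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("need at least one contig length")
    half_total = sum(lengths) / 2
    running = 0
    for count, length in enumerate(lengths, start=1):
        running += length
        if running >= half_total:
            return length, count


# Toy assembly of seven contigs, 300 bp in total.
n50, l50 = n50_l50([80, 70, 50, 40, 30, 20, 10])
print(n50, l50)  # 70 2
```

Here the two longest contigs (80 + 70 = 150) already cover half of the 300 bp total, so N50 is 70 and L50 is 2. A higher N50 and a lower L50 generally indicate a more contiguous assembly.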